A partition-based algorithm for clustering large-scale software systems

Abstract:

Clustering techniques are used to extract the structure of software to support understanding, maintenance, and refactoring. Most software clustering approaches proposed in the literature are either hierarchical algorithms or search-based techniques. In the former, clustering is a process of merging similar clusters or splitting dissimilar ones. These techniques suffer from drawbacks such as the finiteness criterion and arbitrary decisions made during the process. Because clustering software systems is NP-hard, evolutionary and search-based algorithms are more commonly used than hierarchical ones. In evolutionary algorithms, clustering a software system is treated as a search over candidate clusterings. Although these algorithms can often recover an appropriate software structure, they are not applicable to large-scale software. Furthermore, they cannot exploit the knowledge in the artifact dependency graph, which is extracted from the software's source code. In a software system, an artifact can be any entity such as a class, a function, or a file. This paper presents a new partition-based clustering algorithm that partitions the artifact dependency graph while taking the knowledge in it into account. A new distance criterion is also presented to measure the similarity and dissimilarity of artifacts. The proposed algorithm starts from the artifact dependency graph and builds the similarity matrices of the artifacts. It then refines the candidate partition until a fixed point is reached. We expect that, compared with other methods, the proposed method yields a high-quality clustering that is close to the expert's clustering as measured by MoJo-FM. To demonstrate the applicability and validity of the proposed algorithm, a large-scale case study, Mozilla Firefox, is employed. The results demonstrate that the proposed algorithm outperforms the evolutionary methods commonly used in the literature.
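
The pipeline outlined in the abstract (dependency graph, then similarity matrix, then partition refinement up to a fixed point) can be illustrated with a small Python sketch. It is not the paper's algorithm: the abstract does not define the new distance criterion or the refinement rule, so this sketch assumes Jaccard overlap of neighborhoods as the similarity and a k-medoids-style reassignment as the refinement. The names `similarity_matrix`, `refine_partition`, and `seeds` are hypothetical.

```python
from typing import Dict, List, Set


def similarity_matrix(graph: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """Build a similarity matrix from an artifact dependency graph.

    ASSUMPTION: similarity is the Jaccard overlap of undirected
    neighborhoods (each artifact included in its own neighborhood).
    The paper's actual distance criterion is not given in the abstract.
    """
    hood: Dict[str, Set[str]] = {a: {a} for a in graph}
    for a, deps in graph.items():
        for b in deps:
            hood[a].add(b)
            hood.setdefault(b, {b}).add(a)
    return {
        a: {b: len(hood[a] & hood[b]) / len(hood[a] | hood[b]) for b in hood}
        for a in hood
    }


def refine_partition(
    sim: Dict[str, Dict[str, float]], seeds: List[str], max_rounds: int = 100
) -> Dict[str, int]:
    """Reassign artifacts until the partition stops changing (a fixed point).

    ASSUMPTION: a k-medoids-style rule stands in for the paper's
    refinement step; `seeds` are hypothetical initial representatives.
    """
    medoids = list(seeds)
    assignment: Dict[str, int] = {}
    for _ in range(max_rounds):
        # Assign every artifact to the cluster of its most similar medoid.
        new_assignment = {
            a: max(range(len(medoids)), key=lambda i: sim[a][medoids[i]])
            for a in sim
        }
        if new_assignment == assignment:  # fixed point reached
            break
        assignment = new_assignment
        # Pick, per cluster, the member most similar to the other members.
        for i in range(len(medoids)):
            members = [a for a in sorted(sim) if assignment[a] == i]
            if members:
                medoids[i] = max(
                    members, key=lambda m: sum(sim[m][o] for o in members)
                )
    return assignment


# Toy dependency graph: two tightly coupled groups joined by one edge.
toy_graph = {
    "a1": {"a2", "a3"}, "a2": {"a3"}, "a3": {"b1"},
    "b1": {"b2", "b3"}, "b2": {"b3"}, "b3": set(),
}
labels = refine_partition(similarity_matrix(toy_graph), seeds=["a1", "b3"])
```

On this toy graph the `a*` artifacts end up in one cluster and the `b*` artifacts in the other. Evaluating such a result against an expert decomposition with MoJo-FM, as the abstract does, is outside the scope of this sketch.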

Similar resources

A Three-terms Conjugate Gradient Algorithm for Solving Large-Scale Systems of Nonlinear Equations

The nonlinear conjugate gradient method is well known for solving large-scale unconstrained optimization problems because of its low storage requirement and simple implementation. Research on its application to higher-dimensional systems of nonlinear equations is just beginning. This paper presents a Three-term Conjugate Gradient algorithm for solving Large-Scale systems of nonlinear e...

A Variable Structure Observer Based Control Design for a Class of Large scale MIMO Nonlinear Systems

This paper discusses in full how to design an observer-based decentralized fuzzy adaptive controller for a class of large-scale multivariable non-canonical nonlinear systems with unknown functions of the subsystems' states. On-line tuning mechanisms adjust the parameters of both the direct adaptive controller and the observer, guaranteeing the ultimate boundedness of both the tracking error and tha...

Clustering Algorithm for Large-Scale Databases

Clustering systems can discover intentional structures in data and extract new knowledge from a database. Many incremental and non-incremental clustering algorithms have been proposed, but they have some problems. Incremental algorithms work very efficiently, but their performance is strongly affected by the input order of instances. On the other hand, non-incremental algorithms are independent...

An adaptive modified firefly algorithm to unit commitment problem for large-scale power systems

The unit commitment (UC) problem schedules the output power of generation units to meet the system demand for the next several hours at minimum cost. UC adds a time dimension to the economic dispatch problem, with the additional choice of turning generators on or off. In this paper, in order to improve both the exploitation and exploration abilities of the firefly algorithm (FA), a new mo...

Moving-horizon partition-based state estimation of large-scale systems

This report presents three Moving Horizon Estimation (MHE) methods for discrete-time partitioned linear systems, i.e. systems decomposed into coupled subsystems with non-overlapping states. The MHE approach is used due to its capability of exploiting physical constraints on states in the estimation process. In the proposed algorithms, each subsystem solves reduced-order MHE problems t...

A Partition-Based Efficient Algorithm for Large Scale Multiple-Strings Matching

The filtering procedure plays an important role in the Internet security and information retrieval fields, and usually employs a multiple-strings matching algorithm as its key part. All the classical matching algorithms, however, perform poorly when the number of keywords exceeds 5000, which makes large-scale multiple-strings matching a great challenge. Based on the observation that the sp...

Journal title

Volume 18, Issue 4

Pages 37-48

Publication date: 2022-03
